

LARA: an extensible open source platform 
for learning languages by reading 


Branislav Bédi, Matt Butterweck, Cathy Chua, Johanna Gerlach,
Birgitta Björg Guðmarsdóttir, Hanieh Habibi, Bjartur Örn Jónsson,
Manny Rayner, and Sigurður Vigfússon


Abstract. Learning and Reading Assistant (LARA) is an open source platform that 
enables conversion of plain texts into an interactive multimedia form designed to 
support second- and foreign-language (L2) learners. In this workshop, we illustrate the 
open source aspects using collaborative work carried out during a six-week summer 
project at the Árni Magnússon Institute for Icelandic Studies. Three undergraduate
level students extended the platform in different directions in cooperation with other 
members of the international LARA team. The three subprojects were respectively 
concerned with adding automatically generated flashcards, adding multimedia 
versions of poetic texts in the archaic language Old Norse, and extending LARA to 
allow the inclusion of sign language content in Icelandic Sign Language, Íslenskt
Táknmál (ÍTM). All three reached successful conclusions.

1. Introduction


LARA¹⁰ (Akhlaghi et al., 2019) is a collaborative open source¹¹ project, active
since mid-2018, whose goal is to develop tools that enable conversion of plain 
texts into an interactive multimedia form designed to support development of L2 
language skills by reading. The basic approach is in line with Krashen’s (1982) 
influential theory of input, suggesting that language learning proceeds most 
successfully when learners are presented with interesting and comprehensible L2 
material in a low-anxiety situation. LARA implements this abstract programme by 
providing concrete assistance to L2 learners, making texts more comprehensible 
to help them develop their reading, vocabulary, and pronunciation skills. In 
particular, LARA texts include translations and human-recorded audio attached 
to words and sentences, and a personalised concordance constructed from the 
learner’s reading history. The learner, just by clicking or hovering over a word, 
is always in a position to answer three questions: what does it mean, what does it 
sound like, and where have I seen it before? Figure 1 shows an example.
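
The idea of a personalised concordance can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the LARA implementation: the function name `build_concordance` and the input shape (pages as lists of sentence/lemma pairs) are hypothetical.

```python
from collections import defaultdict


def build_concordance(pages_read):
    """Map each lemma to the sentences in which the learner has met it.

    `pages_read` is a list of pages in reading order; each page is a list of
    (sentence_text, [lemma, ...]) pairs. This input shape is hypothetical.
    """
    concordance = defaultdict(list)
    for page in pages_read:
        for sentence, lemmas in page:
            for lemma in set(lemmas):  # record each sentence once per lemma
                concordance[lemma].append(sentence)
    return dict(concordance)


# A toy reading history with two pages, one sentence each.
history = [
    [("Il y a une fleur.", ["il y avoir", "un", "fleur"])],
    [("Il y avait des graines terribles.",
      ["il y avoir", "de", "graine", "terrible"])],
]
concordance = build_concordance(history)
print(concordance["il y avoir"])
```

Because the concordance is built only from pages in the learner's own history, two learners reading the same text in different orders would see different answers to "where have I seen it before?".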


Related platforms, from which we have adapted some ideas, include Learning With
Texts¹² and Clilstore¹³. LARA, however, offers considerably more functionality.
In particular, generation of learner-specific concordances is, as far as we know, 
unique to LARA. The LARA tools are made available through a free portal, 
divided into two layers. The core LARA engine consists of a suite of Python 
modules, which can also be run stand-alone from the command line. These are 
accessed through a web layer implemented in PHP¹⁴. There is comprehensive
online documentation¹⁵.


In this paper, we will concentrate on the open source aspects. We illustrate this 
using work carried out during a six-week summer project at the Árni Magnússon
Institute for Icelandic Studies, where three BA students, who had previously not 
worked with LARA, extended the platform in different directions with some 
assistance from other members of the LARA team. The three subprojects were 
respectively concerned with adding automatically generated flashcards; creating 
multimedia versions of poetic texts written in the archaic language Old Norse; and 
extending LARA to allow inclusion of sign language content. In sections 2 to 4, we 
briefly sketch the three subprojects. The final section summarises and concludes. 


10. https://www.unige.ch/callector/lara/ 

11. https://sourceforge.net/projects/callector-lara/ 

12. https://sourceforge.net/projects/lwt/ 

13. http://multidict.net/clilstore/ 

14. https://www.php.net/ 

15. https://www.issco.unige.ch/en/research/projects/callector/LARADoc/build/html/index.html 

Figure 1. Example of a LARA document: Le petit prince. The user navigates
using the controls at the top (1); the text is in the upper pane, and clicking
on a word displays information about it in the lower pane; here, the
user has just clicked on part of the multiword il y a (‘there is’) (2),
showing an automatically generated concordance (4); hovering the
mouse over a word plays audio and shows a popup translation; clicking
on a loudspeaker plays audio for the preceding sentence (3); the back-arrows
(5) link each line in the concordance to its context of occurrence;
a link to the document can be found on the LARA content page¹⁶




16. https://www.unige.ch/callector/lara-content/ 

2. Extending LARA with flashcards


A common suggestion we have received from LARA users is that it would be useful 
to make the platform more interactive and include functionality that allows learners 
to test their understanding of a text. The first subproject, carried out by a student 
who had just completed a Bachelor of Science in computer science, addressed 
this idea by adding capabilities to create flashcards automatically extracted from 
a LARA text. A member of the core LARA team first wrote a toy version of the 
flashcard module in Python, showing how to extract the necessary information 
from the internalised form. The student then worked autonomously, except for 
a couple of requests for low-level functions to obtain other types of internal 
information. Finally, the flashcard module was incorporated into the web layer by 
another member of the core LARA team, working together with the student. 


The final version of the flashcard module supports five different kinds of flashcards. 
A new set of flashcards is generated from the text each time the functionality is 
accessed; the choice of examples is random, but the ‘distractors’ (incorrect answers)
are constrained so that they are made as similar as possible to the correct answer. An 
example of the simplest kind of flashcard is shown in Figure 2. Examples of other kinds 
of flashcards are shown in the sections on Old Norse and sign language further down. 
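
The selection strategy described above (a random example, with distractors chosen to resemble the correct answer) can be sketched as follows. The paper does not say how similarity is measured, so the character-level ratio from Python's standard `difflib` is an assumption made purely for illustration; the function names and the word list are likewise hypothetical.

```python
import random
from difflib import SequenceMatcher


def similarity(a, b):
    """Surface similarity in [0, 1]; an assumed stand-in for LARA's own measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pick_distractors(answer, candidates, n=3):
    """Return the n wrong answers most similar to the correct one."""
    wrong = {c for c in candidates if c != answer}
    # Sort by descending similarity, breaking ties alphabetically for determinism.
    return sorted(wrong, key=lambda c: (-similarity(c, answer), c))[:n]


def make_flashcard(entries, rng, n_distractors=3):
    """Build one multiple-choice card from (word, translation) pairs."""
    word, answer = rng.choice(entries)  # the example itself is chosen at random
    options = pick_distractors(answer, [t for _, t in entries], n_distractors)
    options.append(answer)
    rng.shuffle(options)  # hide the position of the correct answer
    return {"question": word, "answer": answer, "options": options}


# Hypothetical word/translation pairs, loosely based on Figure 2.
entries = [
    ("vu", "seen"), ("été", "been"), ("pu", "been able"),
    ("pouvoir", "be able"), ("maison", "house"), ("toit", "roof"),
]
card = make_flashcard(entries, random.Random(0))
print(card["question"], card["options"])
```

Passing in a seeded `random.Random` keeps the sketch reproducible; a fresh, unseeded generator would give a new set of cards on each access, as the text describes.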


Figure 2. Example of LARA flashcard for Le petit prince. The student has to pick 
the most appropriate translation for the word presented at the top, ‘Vu’, 
out of the four alternatives. The context for the word is presented in 
both text and audio form 




3. Using LARA for Old Norse 


In most countries, students at middle schools are required to read classic works 
of literature that play an important part in the relevant country’s cultural history: 
English children read Shakespeare, French children Molière, etc. The archaic
language of the texts is, in general, not fully comprehensible to the students 
without some explanation. Our second subproject was designed to see if LARA 
could provide assistance in this kind of situation. In Iceland, the appropriate cultural
referent is the Poetic Edda, a poem-cycle first written down in the late 13th century, 
but composed earlier. The Edda is written in Old Norse, the language spoken in 
Iceland between the 8th and 14th centuries, from which Modern Icelandic has 
developed. Old Norse is much closer to Modern Icelandic than English is to Old 
English, but still displays substantial differences: the grammar is not exactly the 
same, many words have shifted in meaning or have different spellings, and some 
have disappeared. 


Figure 3. Example of LARA document in Old Norse (Völuspá) showing
navigation controls (1); recorded audio for each verse (2); text in both
original (3) and modern orthography (4); words in red are linked to
informative notes (5); clicking on a word, here skein, displays the
information page for the lemma, here skína (6); runic symbol (7)
displays translation of verse; automatically generated links (8) to online 
resources; automatically generated concordance (9); list of notes (10); 
frequency and alphabetical indexes (11); hovering the mouse over a 
word plays audio and shows a popup translation (not featured) 


The student responsible for the subproject, who had just completed her second
year of a BA degree in Icelandic, used the LARA portal to create versions of
the two best-known Edda poems, Völuspá and Hávamál; in contrast to the other
subprojects, this did not involve developing any new platform functionality. 


The poems are annotated with glosses in Modern Icelandic, and read with 
adapted Old Norse pronunciation. Key words and phrases, most often names of 
gods and places, are linked to explanatory notes. An interesting aspect concerns 
kennings, poetic phrases typically of two or three not necessarily contiguous words 
characteristic of the Edda and related Old Norse poems. These could successfully 
be handled by the multiword annotation scheme illustrated in Figure 1, a use of
this mechanism we had not anticipated. An example of a page from an Edda text 
is shown in Figure 3, and a ‘sentence with gap’ flashcard in Figure 4. This work is 
described at greater length in Bédi et al. (2020, this volume). 
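
To make the point about non-contiguous multiwords concrete, here is a minimal sketch of how tokens that belong together could be grouped even when other words intervene. The `Token` class and its `mwe_id` field are hypothetical and do not reflect LARA's actual annotation format; the example reuses the French multiword from Figure 1 rather than a kenning.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    surface: str
    lemma: str
    # Hypothetical field: tokens sharing an id belong to the same multiword.
    mwe_id: Optional[int] = None


def group_multiwords(tokens):
    """Collect the surface words of each multiword, in text order."""
    groups = defaultdict(list)
    for tok in tokens:
        if tok.mwe_id is not None:
            groups[tok.mwe_id].append(tok.surface)
    return dict(groups)


# "Il n'y a personne": the multiword 'il y avoir' is interrupted by "n'".
tokens = [
    Token("Il", "il y avoir", mwe_id=1),
    Token("n'", "ne"),
    Token("y", "il y avoir", mwe_id=1),
    Token("a", "il y avoir", mwe_id=1),
    Token("personne", "personne"),
]
print(group_multiwords(tokens))
```

Under this kind of scheme a kenning whose parts are separated by other words could be treated in the same way as any other multiword: its components share one identifier and one lemma.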


Figure 4. Example of LARA flashcard for Völuspá. The student has to find the
missing word in the incomplete verse presented at the top; after they 
have answered correctly, they can listen to the whole verse using the 
audio control 




4. Adding sign language to LARA documents 


The theme of the third subproject was sign language. Here, the intention was to 
create an initial version of a LARA text designed for Deaf learners. The starting 
point was an existing LARA text for an Icelandic children’s story, Tína fer í frí; this
story had been constructed for a previous experiment (Bédi et al., 2019), where it 
had been used by beginner/intermediate L2 learners in an Icelandic-for-foreigners 
course. In the current project, we repurposed the text so that it could be used by 
Deaf signers of ÍTM who wished to strengthen their Icelandic reading skills. Like
all sign languages, ÍTM has no grammatical connection with the surrounding oral/aural
language, here Icelandic. It is thus by no means assured that native signers
of ÍTM will have strong reading skills in Icelandic, which, for them, is a second
language. 


As with the other two subprojects, core members of the LARA team did a small 
amount of preparatory work, generalising the treatment of multimedia so that word 
and sentence annotations could be supplied in video and audio forms. The rest of 
the project was performed by the student, who had just completed a BA degree 
in ÍTM and translation. He created signed videos for the words and sentences in
the text, using the online recording tool integrated with LARA, after which the 
LARA platform scripts downloaded and linked everything together to create the 
final document. 
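
The generalisation described above, in which an annotation may be either audio or video, can be illustrated with a small sketch. The `MediaAnnotation` class, the `render` function, and the file path are assumptions introduced for illustration only; they are not taken from the LARA code base.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class MediaAnnotation:
    target: str  # the word or sentence being annotated
    kind: str    # "audio" for spoken recordings, "video" for signed ones
    url: str     # location of the recording (hypothetical path)


def render(annotation):
    """Return an HTML5 media element for the annotation."""
    if annotation.kind not in ("audio", "video"):
        raise ValueError(f"unsupported media kind: {annotation.kind}")
    src = escape(annotation.url, quote=True)
    # The same template serves both kinds; only the element name differs.
    return f'<{annotation.kind} controls src="{src}"></{annotation.kind}>'


signed = MediaAnnotation("spyrja", "video", "media/spyrja.mp4")
print(render(signed))
```

Treating the media kind as data rather than hard-coding audio means that the rest of the pipeline, including concordance pages and flashcards, can stay unchanged when signed video is supplied instead of recorded speech.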


The signed video extension was incorporated into the flashcard module developed 
during the first subproject. Examples of LARA pages and flashcards for ÍTM are
shown in Figure 5 and Figure 6. This work will be presented at greater length 
elsewhere. 


Figure 5. Example of LARA document in Icelandic with ÍTM annotation:
controls are similar to Figures 1 and 3; concordance pages contain 
signed video for the word in question (right); clicking on a camera icon 
opens a signed video for the preceding sentence (left) 


Figure 6. Example of flashcard for signed LARA text: the upper video is the
question, and the lower video is the context 




5. Summary and further directions 


We have briefly described three summer projects where students without previous 
exposure to LARA extended it in different directions over a six-week period. This 
ambitious programme was completed in half the time that was originally planned;
encouraged by the successful results, we envisage further collaboration with the 
same and new collaborators. If you are interested in developing other open source 
extensions to LARA and need assistance, please feel free to contact us at the 
addresses given above. 